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ABSTRACT 

Optimal  spectral  decomposition  (OSD)  is  applied  to  ocean  data  assimilation  with  variable  (temperature, 
salinity,  or  velocity)  anomalies  (relative  to  background  or  modeled  values)  decomposed  into  generalized 
Fourier  series,  such  that  any  anomaly  is  represented  by  a  linear  combination  of  products  of  basis  functions  and 
corresponding  spectral  coefficients.  It  has  three  steps:  1)  determination  of  the  basis  functions,  2)  optimal  mode 
truncation,  and  3)  update  of  the  spectral  coefficients  from  innovation  (observational  increment).  The  basis 
functions,  depending  only  on  the  topography  of  the  ocean  basin,  are  the  eigenvectors  of  the  Laplacian  operator 
with  the  same  lateral  boundary  conditions  as  the  assimilated  variable  anomalies.  The  Vapnik-Chervonkis  di¬ 
mension  is  used  to  determine  the  optimal  mode  truncation.  After  that,  the  model  field  updates  due  to  innovation 
through  solving  a  set  of  a  linear  algebraic  equations  of  the  spectral  coefficients.  The  strength  and  weakness  of  the 
OSD  method  are  demonstrated  through  a  twin  experiment  using  the  Parallel  Ocean  Program  (POP)  model. 


1.  Introduction 

Data  assimilation  is  required  for  operational  ocean 
studies  and  maneuvers  (Sun  1999),  and  has  contributed 
significantly  to  the  success  of  ocean  modeling  and  pre¬ 
diction.  In  a  numerical  ocean  model,  a  single  variable  or  all 
the  model  variables  c  (no  matter  two-  or  three-dimensional) 
can  be  ordered  by  grid  point  and  by  variable,  forming 
a  single  vector  of  length  NP  with  N  as  the  total  number 
of  grid  points  and  P  as  the  number  of  variables.  For 
multiple  model  variables,  nondimensionalization  is 
conducted  before  forming  a  single  vector  c.  The  ex¬ 
isting  data  assimilation  is  to  blend  modeled  (or  back¬ 
ground)  fields  (c,!,)  (usually  on  the  grid  points)  with 
observational  data  (cD)  (usually  not  at  the  grid  points) 
of  any  ocean  variable  (Cohn  1997;  Tang  and  Kleeman 
2004;  Chu  et  al.  2004b;  Galanis  et  al.  2006;  Lozano  et  al. 

AUi  1996), 

AU2  c„  =  cfe+w[c0-H(cft)]-  (!) 
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to  represent  the  (unknown)  “truth”  cr  with  an  analysis 
error, 


Here,  c„  is  the  assimilated  field  (analysis  field);  H  is  an 
operator  that  provides  the  model’s  estimate  at  the  ob¬ 
servational  points;  W  is  the  weight  matrix;  and  d  =  [cD  — 

H  (c/j)]  is  the  innovation  (observational  increment) 

(Fig.  1).  Various  data  assimilation  schemes  such  as  op-  [HI 
timal  interpolation  (OI),  Kalman  filter,  and  variational 
method  [three-  and  four-dimensional  variational  data 
assimilation  (3DVAR  and  4DVAR)]  were  developed, 
and  given  unified  notation  by  Ide  et  al.  (1997).  Their  IAU3I 
differences  are  the  different  ways  to  determine  the  weight 
matrix  W.  For  example,  minimization  of  the  cost  function 
in  the  OI  gives  the  weight  matrix  (e.g.,  Bretherton  et  al. 

1976;  Lozano  et  al.  1996),  IAU4I 

W  =  BHT(HBHT  +  R)  (3) 

The  minimization  of  the  analysis  error  covariance  (P)  in 
the  Kalman  filter  (Galanis  et  al.  2006)  leads  to 

w,-  =  (4) 
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Fig.  1.  Illustration  of  ocean  data  assimilation  with  c*  located  at 
the  grid  points  and  c„  located  at  the  points  The  ocean  data 
assimilation  is  to  convert  the  innovation,  d  =  c0  ~  H(cfc),  from  the 
observational  points  to  the  grid  points. 

Here,  B  and  P  are  the  background  error  covariance 
matrices,  where  P'  is  also  called  the  forecast  projection 
matrix  by  some  authors;  R,  is  the  observational  error 
covariance  matrix;  and  t  is  time.  Despite  some  differ¬ 
ences  in  formality,  (3)  and  (4)  are  identical.  The  most 
significant  challenge  for  the  existing  data  assimilation 
methods  is  the  determination  of  the  background  error 
covariance  matrix  B  (or  forecast  projection  matrix  P^) 
for  the  OI  and  3DVAR  (or  Kalman  filter),  since  B  and 
pf  are  enormous  matrices  that  are  difficult  to  estimate 
due  to  the  following  characteristics:  uncertain  tunable 
parameters,  inhomogeneous  and  anisotropic  structures, 
and  complex  boundaries  in  oceans. 

In  standard  OI  the  covariance  of  a  field  between  two 
events  (space,  time)  or  between  a  field  at  a  grid  location 
and  an  observation  are  prescribed  from  some  general 
considerations  on  the  nature  of  the  covariances.  These 
covariances  can  be  converted  to  their  equivalent  repre¬ 
sentations  in  spectral  space.  Oceanographers  have  con¬ 
structed  B  to  include  inhomogeneities  and  anisotropies 
associated  with  the  presence  of  topography,  and  to  reflect 
in  a  way  the  adaptation  of  the  ocean  fields  to  the  topog¬ 
raphy.  Utilization  of  ocean  topography  may  change  the 
weighting  operation,  W[cG  —  H ( C/, )  ]  in  (1).  into  a  mathe¬ 
matical  operator,  T[c0  —  H(c/,)],  that  maps  the  innovation 
(at  the  observational  points)  directly  onto  the  grid  points, 

-  H(Cft)]  =  -F[AcJ,  Ac,  =  c0  -  H(cft), 

(5) 


where  H  could  be  different  in  (1)  and  (5)  when  vertical 
interpolation  is  involved.  The  difference,  A c  =  c  —  C/,.  is 
called  the  anomaly  (relative  to  the  background  field)  of 
c.  Very  early  in  the  application  of  OI  to  ocean  fields, 
Bretherton  et  al.  (1976)  explored  the  use  of  spectral  AU5 
representation  of  functions  defined  on  a  grid  instead  of 
field  values  defined  on  a  grid  was  considered.  Along 
their  path,  the  optimal  spectral  decomposition  (OSD) 
was  developed  to  apply  the  spectral  method  to  field 
values,  that  is,  to  perform  as  such  an  operator  T  with  the 
eigenvectors  of  the  Laplacian  operator  as  the  basis 
functions  that  only  depend  on  the  topography,  satisfy 
the  same  boundary  conditions  as  the  assimilated  ocean 
variable  anomalies  (e.g.,  temperature,  salinity,  velocity), 
and  are  predetermined  before  the  data  assimilation. 

Although  the  relative  simplicity  of  an  atmospheric 
spherical  shell  in  comparison  to  the  complexity  of  oce¬ 
anic  basins  might  explain  the  limited  use  of  spectral 
models  for  the  ocean,  the  OSD  has  been  proven  as  an 
effective  ocean  data  analysis  method.  With  it,  several 
new  ocean  phenomena  have  been  identified  from 
observational  data,  such  as  a  bimodal  structure  of 
chlorophyll-n  with  winter/spring  (February-March) 
and  fall  (September-October)  blooms  in  the  Black  Sea 
(Chu  et  al.  2005b),  fall-winter  recurrence  of  current 
reversal  from  westward  to  eastward  on  the  Texas- 
Louisiana  continental  shelf  from  the  current  meter, 
a  near-surface  drifting  buoy  (Chu  et  al.  2005a),  prop¬ 
agation  of  long  Rossby  waves  at  middepths  (around 
1000  m)  in  the  tropical  North  Atlantic  from  the  Argo 
float  data  (Chu  et  al.  2007),  and  temporal  and  spatial 
variability  of  global  upper-ocean  heat  content  (Chu 
2011)  from  the  data  of  the  Global  Temperature  and 
Salinity  Profile  Program  (GTSPP;  Sun  et  al.  2009). 
However,  the  OSD  method  has  not  yet  been  used  for 
ocean  data  assimilation. 

The  purpose  of  this  paper  is  to  extend  the  use  of  OSD 
from  ocean  data  analysis  to  ocean  data  assimilation.  The 
OSD  can  be  either  three-  or  two-dimensional.  However, 
it  is  conducted  in  a  horizontal  plane  (i.e.,  two-dimensional 
OSD)  in  this  study.  The  rest  of  the  paper  is  organized  as 
follows.  Section  2  discusses  the  lateral  boundary  condi¬ 
tions.  Sections  3  describes  the  generation  of  basis  functions. 
Section  4  presents  variables  at  grid  versus  observational 
points.  Section  5  shows  the  determination  of  spectral 
coefficients  from  minimization  of  combined  observa¬ 
tional  and  analysis  errors.  Section  6  illustrates  the  mode 
truncation  as  a  statistical  learning  process  using  the 
Vapnik-Chervonenkis  (VC)  dimension.  Section  7  shows 
the  ocean  model  with  the  OSD  data  assimilation  pro¬ 
cedure.  Sections  8  and  9  describe  a  twin  experiment  and 
error  statistics  of  the  OSD  data  assimilation.  Section  10 
presents  the  conclusions. 
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2.  Lateral  boundary  condition 

Let  (x,  z)  be  the  horizontal  and  vertical  coordinates, 
respectively;  and  R(z)  be  the  area  bounded  by  the  lateral 
boundary  T(z).  The  anomaly  Ac  satisfies  the  generalized 
homogeneous  lateral  boundary  (T)  condition  (see  the 
appendix  for  a  detailed  explanation), 

^(rjn  ■  V(Ac)  +  £>2(t)Ac  =  0,  (6) 

where  V  =  id/dx  +  j  dldy  is  the  horizontal  gradient  oper¬ 
ator  with  (i,  j)  as  the  unit  vectors  in  the  horizontal  plane; 
n  is  the  unit  vector  normal  to  the  boundary;  r  denotes 
a  moving  point  along  the  boundary;  and  [£>i(r),  bilj)] 
are  parameters  varying  with  r.  The  boundary  condition 
(6)  becomes  the  Dirichlet  boundary  condition  when 
bi  =  0  and  the  Neumann  boundary  conditions  when  b2  =  0. 
It  is  noted  that  different  variable  anomalies  have  dif¬ 
ferent  [£>i(t),  £>2(1")].  For  example,  the  temperature,  sa¬ 
linity,  and  velocity  potential  anomalies  have  £>2  =  0  for 
the  rigid  boundary  and  £>!  =  0  for  the  open  boundary. 
However,  the  streamfunction  anomaly  has  £q  =  0  for  the 
rigid  boundary  and  £>2  =  0  for  the  open  boundary. 


3.  Basis  functions 


a.  Three  necessary  conditions 

Selection  of  basis  functions  {<bk(x,  z)}  needs  to  satisfy 
three  necessary  conditions:  (i)  satisfaction  of  the  same  ho¬ 
mogeneous  boundary  condition  (6)  of  the  assimilated  var¬ 
iable  anomaly,  (ii)  orthonormal,  and  (iii)  independence  of 
the  assimilated  variable.  The  second  necessary  condition  is 
given  by 

0,.(x,z)0/l,(x,z)dx  =  S,,,,  (7) 

JJ«U) 

where  8kkk  is  the  Kronecker  delta,  defined  as 


fo  if  k±U 

kk'  \l  if  k  =  k!  ' 


(8) 


Because  of  the  independence  of  the  assimilated  variable 
(the  third  necessary  condition),  the  basis  functions  are 
available  prior  to  the  data  assimilation. 

Lise  of  the  eigenvectors  of  the  horizontal  Laplacian  op¬ 
erator  as  the  basis  functions  is  an  effective  and  easy  way  to 
get  the  basis  functions  that  satisfy  the  three  necessary  con¬ 
ditions.  The  eigenvectors  {<f>k}  of  the  horizontal  Laplacian 
operator  are  the  solutions  of  the  Poisson  equation, 


V20,  =  [VT)n  ■  \<t>k  +  b2(r)<f>k]\r  =  0, 


k=l  OO 


(9) 


ET  AL.  3 

Here,  {Ayt}  are  the  eigenvalues,  and  n  is  the  unit  vector 
normal  to  the  lateral  boundary.  It  is  noted  that  these  ei¬ 
genvectors  (c bk }  satisfy  the  three  necessary  conditions: 
(i)  satisfaction  of  the  same  homogeneous  boundary  condi¬ 
tion  (9)  as  the  assimilated  variable  anomaly,  (ii)  ortho¬ 
normal,  and  (iii)  independent  of  the  assimilated  variables. 
The  features  (i)  and  (iii)  distinguish  the  eigenvectors 
{<bk\  from  the  commonly  used  empirical  orthogonal 
functions  (EOFs)  in  ocean  data  assimilation  (e.g.,  Pham 
et  al.  1998).  The  EOFs  depend  on  the  assimilated  vari¬ 
ables  and  do  not  satisfy  the  same  homogeneous  boundary 
condition  (9)  as  the  assimilated  variable  anomalies. 

Because  of  irregular  lateral  boundaries,  the  basis 
functions  { <f>k }  are  usually  numerical  solutions  of  (9), 
{0/t(x„,  z)}.  Here,  x„  =  (xh  yj),  n  =  1,  2,  . . .  ,  N,  repre¬ 
senting  the  horizontal  grid  points.  From  now  on,  the 
vertical  coordinate  z  is  omitted  for  simplicity.  The  first  K 
discrete  basis  functions  for  all  grid  points  are  repre¬ 
sented  by  the  following  matrix: 


G  =  iSnk)  = 

'  (X|  ) 
<t>2  (Xl) 

01  (x2)  . 

02  (x2) 

•  ^(xNy 

^2  (X7v) 

_<M*  1) 

0A-(x2)  . 

- 1 

-e- 

b.  Example 

With  the  NOAA  National  Geophysical  Data  Center’s 
Digital  Bathymetry  Data  Base  with  5'  X  5'  resolution 
(ETOP05),  the  basis  functions  (fk(xn)  (k  =  1, . . . ,  K)  at 
a  certain  depth  z  are  computed  for  the  Pacific  Ocean.  In 
assimilating  temperature  observations,  the  temperature 
anomaly  Ac  satisfies  the  Dirichet  boundary  condition 
(£»!  =  0)  at  the  southern  boundary  (Antarctica)  and  the 
Newmann  boundary  condition  (£>2  =  0)  elsewhere  (rigid 
boundary).  Figure  2  shows  the  first  12  basis  functions  [F2] 
{4>k}  for  the  Pacific  Ocean  at  the  surface  for  illustration. 

The  first  basis  function  0,  (x„)  shows  the  latitudinal 
variability.  The  second  basis  function  02(x„)  shows  the 
dipole  pattern  of  zonal  variability  with  opposite  signs  in 
the  eastern  Pacific  (negative)  and  the  western  Pacific 
(positive).  The  third  basis  function  03(x„ )  shows  the 
slanted  dipole  pattern  with  opposite  signs  in  the  north¬ 
eastern  Pacific  (positive)  and  the  southwestern  Pacific 
(negative).  The  fourth  basis  function  04(x„)  shows  the 
tripole  pattern  with  negative  values  in  the  western  and 
eastern  Pacific  and  positive  values  in  between.  The 
higher-order  basis  functions  have  more  complicated 
variability  structures.  Some  features  are  quite  similar  to 
the  recently  described  global  thermal  structure  (e.g.,  Chu 
2011).  It  may  imply  the  topographic  effect  (at  least  par¬ 
tially)  on  the  horizontal  variability  such  as  temperature. 
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Fig.  2.  First  12  r-type  basis  functions  { tf>k ,  k  =  1,  . . .  ,  12}  for  the  Pacific  Ocean  at  the  surface. 
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salinity,  density,  and  velocity  potential  (Song  et  al. 
2001). 

4.  Grid  versus  observational  points 

Let  variable  c  have  M  observations  c„(x("'\  f)  at  loca¬ 
tion  x(m>  (m  =  1,2,...,  M)  indicated  by  a  superscript  (in), 
and  have  background  field  cb(xm  t)  at  grid  points  x„.  The 
H  operator  in  (1)  interpolates  the  background  (modeled) 
field  (cfc)  from  grid  to  observational  point  x(m\ 


N 


with 


cfe(x(m),0=  1  hmncb(xn,t) 

n—1 


N 

X  Km  =  1- 
n=  1 


(11) 


(12) 


which  is  called  the  data  influence  matrix.  It  is  noted  that 
both  matrices  F  and  H  depend  solely  on  the  location  of 
the  observational  points  [x^]. 


5.  OSD  data  assimilation  equation 

The  observational  error  (ec)  at  grid  point  x„  is  given  by 
eo(x„,0  =  cQ(x„,f)  -  ct(xn,t).  (18) 

The  components  of  the  vector  D  represent  the  differ¬ 
ence  (observation  minus  background  values)  at  the  grid 
points, 


The  innovation  d>"1’  is  then  given  by 

d(m)(x(m),  t)  =  co{^m\t)  -  £  hmncb(xn,t) .  (13) 

n= 1 

Let  H  =  [hmn]  be  an  M  X  N  matrix.  Distribution  of  all 
innovations  cfm\x(m))  ( m  =  1,  2,  ...  ,  M )  from  the  obser¬ 
vational  points  into  the  grid  points  is  represented  by  the 
same  proportionality  coefficient.  The  mean  adjustment  at 
grid  point  x„  due  to  all  the  observations  is  given  by 


/A  =  co(v0-cfi(x„,0- 

Its  spectral  form,  Dn,  is  represented  by 


Df]  =  £ 

k  =  1 


(19) 


(20) 


where  K  is  the  mode  number  of  the  optimal  truncation 
(see  section  6).  The  assimilated  field  with  the  given  K  is 
represented  by 


D  = 


M 

X  KJ(m) 

m  =  1 


M 


>  /„=  I  K 


(14) 


m  =  \ 


where  f,  denotes  observational  data  influence  at  grid  point 
x„.  The  larger  the  value  of  the  larger  the  observational 
influence  on  that  grid  point.  Let  d  be  the  M-dimensional 
innovation  vector  and  let  D  be  its  distributed  N- 
dimensional  vector  on  the  grid  points, 

dT  =  (d«  (P\  ...  ,4^),  Dt  =  (Dv d2,  ...  ,Dn), 

(15) 

where  the  superscript  T  indicates  the  transpose.  Note 
that  (14)  can  be  written  in  matrix  form, 


c^M  =  cbM+fnD^. 

The  difference  between  D„  and  is  given  by 
fn(Dn-D^)  =  co(xn,t)-cf\xn,t). 
Substitution  of  (2)  and  (18)  into  (22)  leads  to 
f„(Dn  ~  D<P)  =  \c0(xnJ)  -  ct(xnJ)}  -  [cf  )(x„,  t) 
~ct(xnJ)\  =  eo-B(f>^8(K), 


(21) 


(22) 


(23a) 


which  contains  the  observational  error  (e„)  and  the 
analysis  error  for  a  given  K  [ejp]  at  grid  point  x„.  It  is 
noted  that  sc  contains  the  instrumentation  error,  es¬ 
pecially  those  associated  with  remote  sensing  obser¬ 
vations,  and  the  representativeness  error,  that  is,  for 
example,  the  mismatch  of  the  point  observation  with  the 
ocean  model  resolution.  The  combined  observation- 
analysis  error  variance  over  the  whole  domain  is  defined 
by 


FD  =  HTd, 


and  F  is  an  IV  X  N  diagonal  matrix. 


(16) 


N 


4  = 


I  Un(Dn  ~  D{nK))f 

n= 1 

N- 1  ‘ 


(23b) 


/i  0  0 

0  f2  0 

0  0  '■ 
0  0  0 


0 

0 

0 

4 


(17) 


Minimization  of /lr  after  substituting  (19)  and  (20)  into  (23b), 

2\ 


1 


dak  N  —  1  dak 


N 


n= 1 


K 


i  =°’ 


k=l 
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leads  to  a  set  of  algebraic  equations  for  the  spectral 
coefficients  {ak}, 


X 

k= 1 


N 


X  <MX„)/,A(X„) 

n—1 


N 


ak=  1  <t>l(xn)f„Dn 
n—1 


1  =  1,2,  ...  ,K,  (24) 

which  is  rewritten  into  a  matrix  form  after  using  (10), 

GFGtA  =  GFD,  (25) 

where  A  is  the  spectral  coefficient  vector,  AT  =  (ax, 
a2,  ■  ■  .  ,  «k)-  The  solution  of  (25)  is  given  by 


A=  [GFGt]  ‘GFD. 


(26) 


Then  (21)  is  written  into  the  matrix  form  after  using  (16), 
c  =  c,  +  FGT[GFGT]  1E.  E  =  GFD  =  GHTd,  (27) 


which  is  called  the  OSD  data  assimilation  equation.  The 
vector  E  denotes  the  observational  innovation  projected 
into  the  spectral  space.  The  matrix  form  of  the  OSD  data 
assimilation  is  quite  similar  to  the  existing  ocean  data 
assimilation  schemes  (Of  and  the  Kalman  filter  with  the 
matrix  B  replaced  by  F  and  H  by  G  [see  (3)  and  (4)]}. 
The  two  matrices  B  and  F  play  a  similar  role  that  make 
the  analysis  field  compact  in  the  observational  data-rich 
area.  It  is  also  noted  that  the  OSD  data  assimilation  [see 
(27)]  is  applied  to  one  vertical  level  z.  As  such,  the  data 
assimilation  may  distort  the  vertical  stratification.  Re¬ 
cently  developed  fully  conserved  minimal  adjust 
schemes  (Chu  and  Fan  2010;  Wang  et  al.  2012)  can  be 
used  to  stabilize  the  vertical  stratification. 


6.  Mode  truncation  using  the  V apnik-Chervonenkis 
dimension 

The  assimilation  results  depend  on  the  mode  trunca¬ 
tion  ( K ),  since  the  spectral  coefficients  («| ,  ■  ■  •  ,  u/<)  are 
determined  on  the  base  of  minimization  of  the  com¬ 
bined  observation-analysis  error  variance  /lr  [see  (23b)] 
for  the  given  K.  With  the  calculated  spectral  coefficients 
(flj, . . . ,  aK)  based  on  observational  data,  the  assimilated 
field,  c[K\xn,  t ),  can  be  calculated  at  any  grid  point  (x„) 
using  (27),  and  the  analysis  error  is  estimated  by  [see  (2)] 

ef)(x„-0  =  eft(xn,0+/n  X  ak{t)<t>k(xn), 

k= 1 

eb(Xn’0  =  Cb(ltn,t)-Ct(Xn,t),  (28) 

where  sb(xn,  t)  is  the  model  (or  background)  error.  It  is 
noted  from  (28)  that  reduction  of  the  model  (or 


background)  error  sb  (i.e.,  smaller  rJaK>)  is  achieved  by 
the  observational  innovation  using  OSD  (second  term 
on  the  right-hand  side). 

Since  the  “true”  field,  cr(x„,  t ),  is  still  uncertain,  the 
analysis  error  efaK>  should  be  estimated  probabilistically. 

In  the  spectral  decomposition  method,  the  observation 
space  and  the  model  space  are  projected  into  the  spectral 
space.  There  is  a  need  to  ensure  that  the  size  of  the  spec¬ 
tral  space  is  adequate  for  these  two  purposes.  The  spectral 
representation  acts  as  a  spatial  low-pass  filter  for  the  fields, 
where  the  highest  allowed  wavenumbers  relate  to  the 
highest  spectral  eigenvalues.  The  observational  network  is 
required  to  provide  information  without  aliasing.  For  ex¬ 
ample,  in  an  eddy  field  in  the  deep  ocean,  one  expects  that 
the  basis  functions  can  resolve  well  features  of  the  size  of 
the  Rossby  radius  of  deformation.  Thus,  the  ratio  of  ob¬ 
servational  points  ( M )  and  the  spectral  truncation  (K)  is 
a  key  to  determining  the  optimal  mode  truncation  Kopt.  It 
is  noted  that  c*(x,„  f),  («i, ....  aK ),  and  {<fik,  k  =  1, . . .  ,K} 
are  given.  Let  J  be  the  ensemble  average  of  analysis  error 
variance.  The  probability  for  the  upper  bound  of /is  given 
by  (Vapnik  2000;  Chu  et  al.  2003x)  AU6  IAU7I 


[\n(2M/K)  +  1]  -ln(V4)l  „ 

- ME - }=1~TJ’ 

(29) 

where  the  mode  truncation  K  is  treated  as  the  VC  di¬ 
mension  and  17  (<1)  is  the  significance  level.  Term  /*  is 
the  upper  bound  of  /tr.  The  minimization  of  the  VC  cost 
function  ( JK ), 


P{J<Jlx+J 


JK  =  JtT  +  M ’  Tl)’  P(K'  M ’  v) 
=/*1 


[ln(2 M/K)  +  1]  -  In (tj/M) 


MIK 


K  =  1,2, 


00 

•>  •> 

(30) 


leads  to  another  set  of  spectral  coefficients  . . . ,  afy. 
It  is  noted  that  for  a  given  M ,  /,r  decreases  mono- 
tonically  with  K  and  that  /jl  increases  with  K  if  17  is  given 
(17  =  0.1  in  this  study).  Thus,  JK  has  a  minimum  value  for 
certain  mode  number  Kopt, 


min  (jk)  =  JK  ■  (31) 

K  opt 


7.  Ocean  modeling 

a.  Model  description 

The  Parallel  Ocean  Program  (POP)  model  is  used  to 
show  the  feasibility  of  the  OSD  data  assimilation.  Within 
the  framework  of  the  Community  Earth  System  Model 
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Level 

Depth 

Level 

Depth 

Level 

Depth 

Level 

Depth 

Level 

Depth 

1 

5 

13 

125 

25 

268 

37 

708 

49 

2649 

2 

15 

14 

135 

26 

285 

38 

787 

50 

2890 

3 

25 

15 

145 

27 

305 

39 

879 

51 

3133 

4 

35 

16 

155 

28 

328 

40 

985 

52 

3380 

5 

45 

17 

165 

29 

351 

41 

1106 

53 

3628 

6 

55 

18 

175 

30 

378 

42 

1245 

54 

3876 

7 

65 

19 

186 

31 

409 

43 

1401 

55 

4126 

8 

75 

20 

198 

32 

443 

44 

1574 

56 

4375 

9 

85 

21 

210 

33 

483 

45 

1764 

57 

4625 

10 

95 

22 

223 

34 

528 

46 

1969 

58 

4875 

11 

105 

23 

236 

35 

579 

47 

2186 

59 

5125 

12 

115 

24 

251 

36 

639 

48 

2414 

60 

5375 

(CESM),  the  POP  is  a  time-dependent,  level-coordinate 
primitive  equation  ocean  general  circulation  model 
rendered  on  a  three-dimensional  grid  that  includes 
a  free  surface  and  realistic  topography  (Smith  et  al. 
AU8I  2010).  The  B  grid  is  used  for  the  spatial  discretization. 

Derived  from  the  Bryan-Cox-Semtner  class  of  models, 
AU9I  the  POP  (Dukowicz  and  Smith  1994)  was  officially 
adopted  as  the  ocean  component  of  the  CESM  based  at 
NCAR  in  2001.  It  has  an  implicit  free  surface  and  general 
IAU10I  orthogonal  coordinates  (Smith  et  al.  1995).  It  is  a  global 
model,  with  the  grid  defined  so  that  the  pole  is  located  in 
Greenland.  Since  the  purpose  of  this  study  is  to  show 
the  feasibility  of  the  OSD  data  assimilation  rather  than 
to  simulate/predict  the  real  ocean  processes,  a  low- 
horizontal-resolution  (3°),  60-vertical-level  (Table  1) 
version  of  the  model  with  a  time  step  of  2  h  is  used  in  this 
study.  In  the  top  175  m,  the  model  has  30  levels  with  10  m 
between  each  of  the  consecutive  levels.  The  discretized 
model  variable  at  the  grid  points  is  represented  by  c(x„,  zi , 
f),  n  =  1,  2,  ...  ,  TV/,  1  =  1,  2,  ...  ,  L.  Here,  N/  is  the  total 
number  of  the  horizontal  grid  points  at  the  vertical  level 
/  and  L  =  60  is  the  total  number  of  the  vertical  levels. 

The  atmospheric  forcing  at  the  surface  is  provided  by 
an  annually  varying  climatology  derived  from  the  sur¬ 
face  Co-ordinated  Ocean-Ice  Reference  Experiments 
(CORE),  version  2  (Large  and  Yeager  2009).  The  air- 
sea  fluxes  of  momentum,  heat,  freshwater,  and  their 
components  have  been  computed  globally  from  1948  at 
frequencies  ranging  from  6  hourly  to  monthly.  All  fluxes 
are  computed  over  the  23  years  from  1984  to  2006,  but 
radiation  prior  to  1984  and  precipitation  before  1979  are 
given  only  as  climatological  mean  annual  cycles.  The 
AUlil  input  data  are  based  on  NCEP  reanalysis  for  the  surface 
vector  wind,  temperature,  specific  humidity,  and  density,  and 
on  a  variety  of  satellite  based  radiation,  sea  surface  tem¬ 
perature,  sea  ice  concentration,  and  precipitation  products 
(from  https://climatedataguide.ucar.edu/climate-data/large- 
yeager-air-sea-surface-flux-corev2-1949-2006).  The  model 
simulations  for  this  experiment  used  climatological 


forcing,  (daily  23-yr  average).  The  forcing  is  in¬ 
terpolated  to  the  time  step  of  the  model. 

The  POP  model  has  been  spun  up  from  rest  and  clima¬ 
tological  annual  mean  (temperature  and  salinity)  with  the 
daily  climatological  surface  forcing  from  the  CORE,  ver¬ 
sion  2  (Large  and  Yeager  2009),  and  integrated  for  a  period 
of  over  300  simulation  years.  The  model  output  for  the  year 
300  (c30o)  is  treated  as  the  “truth  field,”  cr(x„,  zi ,  f). 

b.  Initial  error 

Although  we  are  using  a  global  model,  temperature 
“observations”  are  only  incorporated  for  the  Pacific  basin. 

It  is  noted  that  use  of  single-variable  (i.e.,  temperature) 
data  is  not  ideal,  since  observational  temperature  (7),  sa¬ 
linity  (.S'),  and  velocity  (Y)  data  should  be  assimilated  to 
keep  dynamic  balance,  since  ( T ,  S,  V)  are  the  dependent 
variables  in  ocean  models.  Chu  (2006)  shows  that  assimi¬ 
lation  with  (  /’,  .S)  data  only  introduces  dynamic  imbalance 
and  suggests  that  geostrophic  velocity  corresponding  to 
the  ( T ,  S )  data  should  also  be  assimilated.  The  results  of 
this  study  are  only  used  for  the  preliminary  evaluation. 

The  model  is  integrated  from  1  March  of  year  210  and 
uses  observations  sampled  from  the  fields  from  1  March 
of  year  300.  The  initial  error  (the  variable  c  denoting 
temperature)  is 

e(v  Z/’  ?o)  =  C2to(X«’  ?o)  “  C30t)(Xn’  */’  fo)  ■  (32) 

The  temperature  at  the  surface  initially  has  maximum 
errors  (i.e.,  the  mismatch  between  years  210  and  300), 
such  as  +2°C  in  the  Southern  Ocean  near  the  Antarctic 
and  — 2°C  north  of  the  Kuroshio  Extension;  medium 
errors,  such  as  +1°C  in  the  central  equatorial  Pacific; 
and  low  errors  (|e|  <  0.5°C)  in  subtropical  areas  in 
both  hemispheres  (Fig.  3a).  The  temperature  initially  [F3] 
has  smaller  errors  at  1106-m  depth  (level  41)  with 
maximum  errors  in  the  circumpolar  currents  near  +1°C 
in  the  west  and  —  1°C  in  the  east,  and  low  errors 
(|e|  <  0.5°C)  elsewhere  (Fig.  3b).  The  model  without 
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Fig.  3.  Initial  errors  in  temperature  (°C)  at  (a)  the  sea  surface  and  (b)  the  depth  of  1106  m  (at  level  41). 


data  assimilation  is  integrated  from  1  March  with  the 
initial  condition, 


C6(V^o)  =  C21o(Xn’ Vo)’ 


(33) 


using  daily  surface  forcing  for  20  days,  represented  by 

t-non(Xm  2/,  /). 

8.  OSD  data  assimilation 

a.  Bilinear  interpolation 


nXm)Jxi+l-x^)(yj+1-y^) 
WiJ  (xi+i-xi)(yj+i-yj) 
(m)  _(^m)-^)(y;+i-y(w)) 
Wi+1J  (xi+1-xi)(yj+1-yj) 

(m)  Jxi+1-x^wn)-yj) 

Wij+1  (xi+i-xi)(yj+i-yj) 

(m)  _(x(m)-x,.)(y(m)-yy) 

Vi+1J+1  (xi+\-xt)(yj+i-y,)' 


(35) 


Let  the  observational  point  x(m)  located  in  the  grid  cell 
be  Xi  <Xi+i,  yj  <yj+\.  In  this  study,  the  bi-  It  is  noted  that  the  proportionality  coefficients 


linear  interpolation  is  chosen  for  the  H  operator  (Fig.  4), 

\i(r  \  _  _i_  ...(ml  _  i  ...('«) 

n\cb>  Wi,j  Cb(iJ)  +  Wi+l,jCb(i+l,i)  +  Wi,j+lCb(i,j+l) 


Wif,  *P;+ ij,  wj”+  v  w\+\,j+\  I  depend  solely  on  the  lo¬ 
cation  of  the  observational  points  (x-"d),  and 


l  (m) 

+  Wi+lj+lCb(i+lj+l)’ 


where 


(34) 


Wij  +  Wi+lj  +  Wij+ 1  +  Wi+lj+l  ~  1  • 


(36) 


Each  row  of  the  MX  N  matrix  H  =  in  (16)  only  has 
four  nonzero  values, 
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Fig.  4.  Bilinear  interpolation  for  calculating  the  basis  functions 
at  the  observational  point  xm  from  their  values  at  the  four  neigh¬ 
boring  grid  points. 


Other  simple  interpolations  such  as  inverse  distance 
weighting,  spline,  and  trigonometric  polynomials  can 
also  be  used  for  H  matrix. 


b.  Twin  experiment 

A  sampling  pattern  consisting  of  a  data-rich  area  and 
a  data-poor  area  offers  a  challenge  to  the  ability  of  the 
basis  function  to  represent  the  intended  fields  well,  since 
the  projection  of  the  data  onto  the  spectral  fields  is  likely 
to  generate  a  field  defined  in  the  entire  domain.  To  test 
the  ability  of  the  OSD  to  represent  the  intended  fields, 
the  “observational'’  data  are  sampled  from  c3oo  begin¬ 
ning  with  1  March  for  20  days  at  locations  (unchanged 
during  the  data  assimilation  process)  given  by  the  hori¬ 
zontal  distribution  of  the  Argo  floats  in  March  2003 

[F5l  (Fig-  5).  This  produces  the  observational  dataset 

[c0(x(1\  z,  t),  . . . ,  c0(x(M\  z,  /)]  with  a  data-rich  area 
north  of  20°S  and  a  data-poor  area  south  of  20°S.  If  the 
spatial  decorrelation  scale  is  much  less  than  the  domain 
size,  then  the  analysis  fields  using  OI  will  be  compact  in 
the  data-rich  area  (i.e.,  north  of  20°S). 

The  OSD  data  assimilation  process  at  day  =  t  follows 
(27)  with  the  following  procedure:  (i)  determine  the 
optimal  mode  decomposition  (ii)  compute  the 

difference  between  the  observational  and  background 
values  at  the  observational  points  following  (13); 

(iii)  substitute  the  difference  into  (26)  to  obtain  the  spec¬ 
tral  coefficients  [a,(z,  t ),  a2(z,  t ),  ...  ,  aKopt(z,  t)];  and 

(iv)  substitute  the  spectral  coefficients  [fli(z,  t),  a2(z,  t), . . . , 
aK(z,  t0)]  into  (27)  to  get  the  assimilated  initial  condition 
ca(x„,  z ,  /).  The  dependence  of  the  VC  cost  function  (JK) 

[FBI  on  the  VC  dimension  (Fig.  6)  shows  that  the  optimal  mode 

truncation  is  Kopt  =  12  at  125-m  depth  (level  13;  Table  1) 
and  day  0  (for  illustration).  The  assimilation  model  is  then 
run  forward  in  time  for  24  h  with  the  model  field  saved  at 
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Fig.  5.  Daily  sampling  taking  from  the  horizontal  distribution  of 
the  Argo  floats  in  March  2003.  It  is  noted  that  the  observational 
data-rich  area  is  north  of  20°S  and  that  the  observational  data-poor 
area  is  south  of  20°S. 


the  end  of  24  h,  which  is  the  background  held  for  the  day  = 
( t  +  1),  Ci(x„,  z,  t  +  1  day).  At  each  assimilation  time,  the 
optimal  mode  truncation  (7Copi)  is  recalculated.  This  pro¬ 
cess  repeats  for  20  days  and  leads  to  the  assimilated  out¬ 
put,  c„(x„,  z,  t). 

9.  Error  statistics 

The  three  datasets  ca(x„,  zh  0,  cnon(x„,  zh  t),  c,(x„,  zh  0 
(/  =  1, . . . ,  V)  for  the  period  of  20  days  (t0  <  t  <  t\  =  t0  + 
20  days)  are  used  to  show  the  root-mean-square  error 


Fig.  6.  Optimal  mode  decomposition  (Kopt)  at  125-m  depth  and 
day  0  is  determined  by  the  minimization  of  the  VC  cost  function 
(denoted  by  red  square). 
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Fig.  7.  Comparison  between  the  assimilation  and  nonassimilation  runs  of  the  temporally 
varying  basinwide  (a)  RMSE  and  (b)  BIAS. 


(RMSE)  with  and  without  the  OSD  data  assimilation.  The  local  RMSE  and  BIAS  for  the  assimilation  and 
Let  N/  be  the  number  of  the  horizontal  grid  points  at  the  nonassimilation  runs  are  given  by 
vertical  level  /  and  L  be  the  number  of  the  total  vertical 


levels  (L  =  60).  The  basinwide  RMSE  and  BIAS  for  the  U  l 

AU12|  assimilation  run  (-Eassim,  5assim)  and  nonassimilation  run  irn(xn,  t)  =  \/  {  X  [cfl(x„>  0  “  c,(x«’  zi>  0]  > 


(£non,  «,»,„)  are  given  by 


(39a) 


11  £  1 


N, 


£assim«  =  \  /  f  X  <J  ^  X  [cfl(Xf ,  Z,,  0  “  cfa,  Z,,  t)f  J  , 

(37a) 


l  i=i  i  y  /= i 


£non(V  0  =  \ll  t  KonK’  Z/’  0  “  CMn’Zl >  Of  - 

(39b) 


1 1  ^  f  ! 


^nonW  \j  £  X  [Cnon(Xi’  Zl’  0  Cr(Xi’Z/>0]  -®assim(Xn’^  2a  [cfl(Xn>  Z/,  0  Ct(Xn’ZZ’0]> 


(37b) 


(40a) 


1  A  ll 


v, 


5assim (0  =  1  1 1  ^  I  Mxi’  2/-  0  ~  ct(xP  0]  f > 


(38a) 


1  L 

Bnon (*»>  0  =  l  X  [Cnon(Xn’ZZ>  0  “  Cf(X«’  Z/’  0]  • 


Z=1 


(40b) 


Bnon(0  =  [l|^  X  [Cnon(XP  ZV  <)  ~  C,(\’  ZV  0]  j  • 

(38b) 

Figure  7  shows  the  comparison  of  the  basinwide  RMSE 
and  BIAS  of  the  model  between  without  data  assimila¬ 
tion  (dashed  curve)  and  with  the  OSD  data  assimilation 
(solid  curve).  RMSE  increases  from  0.50°C  at  day  1  to 
0.52°C  at  day  20  (0.02°C  increase)  without  data  assimi¬ 
lation  (4%  of  error  increase),  and  it  decreases  from 
0.50°C  at  day  1  to  0.43°C  at  day  20  with  the  OSD  data 
assimilation  (14%  of  error  decrease).  BIAS  increases 
from  0.080°C  at  day  1  to  0.082°C  at  day  20  (0.002°C  in¬ 
crease)  without  data  assimilation  (2.5%  increase),  and  it 
decreases  from  0.08°C  at  day  1  to  0.04°C  at  day  20 
(0.04°C  decrease)  with  the  OSD  data  assimilation  (50% 
decrease). 


A  comparison  of  the  local  RMSE  (Fig.  8)  and  BIAS  [FB] 
(Fig.  9)  between  without  data  assimilation  (right  F9 
panels)  and  with  the  OSD  data  assimilation  (left 
panels)  for  day  1  (top  panels),  day  10  (middle  panels), 
and  day  20  (bottom  panels)  shows  the  strength  and 
weakness  of  the  OSD  scheme.  At  day  1,  the  local 
RMSE  and  BIAS  are  quite  comparable  between  the 
assimilated  run  (Figs.  8a  and  9a)  and  the  non- 
assimilated  run  (Figs.  8b  and  9b).  The  local  RMSE  has 
large  values  around  ~2°C  in  the  central  equatorial 
Pacific  (10°S-10°N,  160°-120°W),  in  the  eastern  tropical 
North  Pacific  (10°-18°N,  120°-90°W),  a  very  narrow 
strip  in  the  Antarctic  Circumpolar  Current  region  near 
the  ice  shelf  (south  of  68°S,  160°-90°W),  and  relatively 
low  values  elsewhere.  The  local  BIAS  has  large  values 
around  0.5  ~  1°C  in  the  most  areas  of  the  low  latitudes 
(20°S-20°N)  and  high  latitudes  (north  of  40°N)  except 
in  the  eastern  Pacific  near  coastal  regions  and  in  the 
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Fig.  8.  Comparison  of  temporally  varying  local  RMSE  for  the  (a)  assimilation  run  at  day  1,  (b)  nonassimilation 
run  at  day  1,  (c)  assimilation  run  at  day  10,  (d)  nonassimilation  run  at  day  10,  (e)  assimilation  run  at  day  20,  and 
(f)  nonassimilation  run  at  day  20. 
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Antarctic  Circumpolar  Current  region,  and  relatively 
low  values  elsewhere. 

As  the  time  progresses,  the  local  (RMSE,  BIAS)  for 
the  nonassimilate  run  remains  almost  the  same  at  day  10 
(Figs.  8d  and  9d)  and  day  20  (Figs.  8f  and  9f)  as  at  day  1 


(Figs.  8b  and  9b),  but  it  changes  evidently  for  the  as¬ 
similated  run  at  day  10  (Figs.  8c  and  9c)  and  day  20 
(Figs.  8e  and  9e)  as  compared  to  day  1  (Figs.  8a  and  9a). 

The  local  RMSE  is  reduced  drastically  north  of  20°S  with 
a  disappearance  of  high  local  RMSE  originally  (day  1)  in 
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(b)  nonassimilation  run  at  day  1,  (c)  assimilation  run  at  day  10,  (d)  nonassimilation  run  at  day  10, 
(e)  assimilation  run  at  day  20,  and  (f)  nonassimilation  run  at  day  20. 
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the  central  equatorial  Pacific,  and  the  eastern  tropical 
North  Pacific,  and  is  not  reduced  or  may  even  be  in¬ 
creased  slightly  south  of  20°S  with  the  appearance  of 
high  local  RMSE  originally  (day  1)  in  the  Antarctic 
Circumpolar  Current  region  near  the  ice  shelf  (south  of 
68°S,  160°-90°W).  It  is  noted  that  the  areas  with  large 
reduction  in  error  and  bias  are  the  observational  data- 
rich  areas,  and  that  areas  with  less  decreasing  (or  even 
increasing)  in  error  and  bias  are  the  observational  data- 
poor  areas  (cf.  Figs.  8  and  9  to  Fig.  5). 

10.  Conclusions 

The  OSD  method  has  been  developed  for  ocean  data 
assimilation  on  the  basis  of  the  classic  theory  of  the 
generalized  Fourier  series  expansion,  such  that  any 
ocean  field  is  represented  by  a  linear  combination  of  the 
products  of  basis  functions  (modes)  and  corresponding 
spectral  coefficients.  The  basis  functions  are  the  eigen¬ 
vectors  of  the  Laplacian  operator,  determined  only  by 
the  topography  with  the  same  lateral  boundary  condi¬ 
tions  for  the  assimilated  variables. 

Different  from  the  existing  ocean  data  assimilation 
methods  such  as  optimal  interpolation,  Kalman  filters, 
and  variational  methods  (originally  developed  for  at¬ 
mospheric  data  assimilation),  the  OSD  method  has  four 
specific  features:  (i)  effective  utilization  of  the  ocean 
topographic  data,  (ii)  orthonormal  and  predetermined 
basis  functions  that  are  independent  on  and  satisfy  the 
same  lateral  boundary  condition  of  the  assimilated 
variable  anomalies,  (iii)  no  requirement  of  a  priori  in¬ 
formation  on  a  background  error  covariance  matrix  (B), 
and  (iv)  optimal  mode  truncation  through  minimization 
of  the  Vapnik-Chervonkis  dimension  as  a  statistical 
learning  process.  After  the  mode  truncation,  the  model 
field  updates  due  to  innovation  through  solving  a  set  of 
a  linear  algebraic  equations  of  the  spectral  coefficients. 

The  capability  of  the  OSD  method  is  demonstrated 
through  a  twin  experiment  using  the  Parallel  Ocean 
Program  (POP)  model  for  the  Pacific  Ocean.  For  an 
objective  evaluation,  the  “observational”  data  are  not 
uniformly  distributed  in  the  data-rich  area  north  of  20°S 
and  in  the  data-poor  area  south  of  20°S.  Within  20  days, 
the  basinwide  RMSE  (BIAS)  increases  4%  (2.5%) 
without  the  OSD  data  assimilation  and  decreases  14% 
(50%)  with  the  OSD  data  assimilation.  However,  the 
improvement  using  the  OSD  data  assimilation  depends 
on  the  observational  data  distribution.  The  local  RMSE 
is  reduced  drastically  in  data-rich  areas  (i.e.,  north  of 
20°S)  but  not  in  data-poor  areas  (i.e.,  south  of  20°S). 

No  use  of  the  a  priori  B  matrix  implies  that  the  ob¬ 
servations  are  purely  extrapolated  to  the  data-poor  area 
with  the  control  of  the  observational  influence  matrix  F 


T  AL.  13 

[see  (17)  and  (27)].  Since  the  extrapolation  causes  un¬ 
predictable  analysis  errors  and  the  twin  experiment  does 
not  show  improvement  by  OSD  assimilation  in  the  data- 
poor  area,  further  studies  on  constructing  the  F  matrix 
are  needed.  Moreover,  verification  using  a  twin  experi¬ 
ment  is  just  a  first  step.  Feasibility  studies  should  be 
conducted  for  real  ocean  data  such  as  conductivity- 
temperature-depth  (CTD),  expendable  bathythermo¬ 
graph  (XBT),  Argo  profiling  data,  and  glider  data. 

The  OSD  method  proposed  here  is  two-dimensional 
and  conducted  at  each  vertical  level  with  the  basis 
functions  given  by  the  eigenvectors  of  the  horizontal 
Laplacian  operator.  This  can  be  extended  to  a  three- 
dimensional  OSD  method  with  the  basis  functions  given 
by  the  eigenvectors  of  the  three-dimensional  Laplacian 
operator,  where  much  larger  matrix  operations  will  be 
involved.  Besides,  for  the  three-dimensional  OSD,  the 
surface  boundary  conditions  of  the  assimilated  variable 
anomalies  may  vary  due  to  local  climatology.  Its  impact 
on  the  three-dimensional  basis  functions  will  be  in¬ 
vestigated  in  future  studies. 
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Naval  Oceanographic  Office,  and  the  Naval  Post¬ 
graduate  School  supported  this  study. 

APPENDIX 

Derivation  of  Lateral  Boundary  Condition  [(6)] 

Generally,  the  assimilated  ocean  variable  c  (temper¬ 
ature,  salinity,  density,  velocity, . . .)  have  the  lateral 
boundary  (T)  condition 

h^rjn  ■  Vc  +  b2(r)c  =  D(r),  (Al) 

where  D(t)  is  the  forcing  term  varying  with  r.  With  the 
inhomogeneous  boundary  condition  (Al),  the  assimi¬ 
lated  variable  c(x,  z,  t)  consists  of  two  parts, 

c(x,  z,  t)  =  c(x,  z,  t)  +  S(x,  z,  t) ,  (A2) 

where  S(x,  z,  t)  is  the  solution  of  the  Laplacian  equation 
with  the  inhomogeneous  boundary  condition 

V2S  =  0,  ^(tK  \S+k2(T)S  =  D(T)  at  T;  (A3) 

and  c(x,  z,  t)  =  c(x,  z,  t)  —  S(x,  z,  t ),  which  satisfies  the 
homogeneous  boundary  condition 

hj(T)n  ■  Vc  +  b2(r)c  =  0.  (A4) 

Since 

cb(x,  z,  t)  =  cb(x,  z,  t )  +  S(x,  z,  t) ,  (A5) 
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subtraction  of  (A5)  from  (A2)  leads  to 

Ac(x,  z,  t )  =  c(x,  z,  t)  -  cb(x,  z,  t )  =  c(x,  z,  t)  -  cfc(x,  z,  t) . 

(A6) 

Both  c(x,  z,  t)  and  q,(x,  z,  t )  satisfy  the  boundary  con¬ 
dition  (A4),  which  leads  to  the  boundary  condition  (6) 
for  Ac, 

h^r)!!  ■  V(Ac)  +  £>2(T)Ac  =  0. 
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